Optimal Sequential Vector Estimation

Yasin Yilmaz, George V. Moustakides, and Xiaodong Wang



I. Problem Formulation and Background Information 

We represent scalars with lower-case letters, vectors with upper-case letters, and matrices with upper-case bold letters. Consider the following linear signal model,

$$ y_t = H_t^T X + w_t, \quad t \in \mathbb{N}, \tag{1} $$

where $y_t \in \mathbb{R}$ is the observed sample, $X \in \mathbb{R}^n$ is the deterministic but unknown vector of parameters to be estimated, $H_t \in \mathbb{R}^n$ is the random vector of scaling coefficients (e.g., the channel gain vector in a multiaccess channel), and $w_t \in \mathbb{R}$ is the additive noise. We observe, at each time $t$, the sample $y_t$ and the coefficient vector $H_t$. Hence, at each time $t$, $\{y_\tau, H_\tau\}_{\tau=1}^{t}$ are known. We assume $\{w_t\}$ are i.i.d. with $\mathrm{E}[w_t] = 0$ and $\mathrm{Var}(w_t) = \sigma^2$.
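The ordinary least squares (OLS) estimator minimizes the sum of squared errors, i.e.,

$$ \hat{X}_t = \arg\min_{X} \sum_{\tau=1}^{t} (y_\tau - H_\tau^T X)^2, \tag{2} $$

and is given by

$$ \hat{X}_t = \left( \sum_{\tau=1}^{t} H_\tau H_\tau^T \right)^{-1} \sum_{\tau=1}^{t} H_\tau y_\tau = (\mathbf{H}_t^T \mathbf{H}_t)^{-1} \mathbf{H}_t^T Y_t, \tag{3} $$

where $\mathbf{H}_t = [H_1, \ldots, H_t]^T$ and $Y_t = [y_1, \ldots, y_t]^T$.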

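Under Gaussian noise, $w_t \sim \mathcal{N}(0, \sigma^2)$, the OLS estimator coincides with the minimum variance unbiased estimator (MVUE). That is to say, the OLS estimator achieves the Cramér-Rao lower bound (CRLB), i.e., $\mathrm{Cov}(\hat{X}_t | \mathbf{H}_t) = \mathrm{CRLB}$. To compute the CRLB we first write, given $X$ and $\mathbf{H}_t$, the log-likelihood of the vector $Y_t$ as

$$ L_t = \log f(Y_t | X, \mathbf{H}_t) = -\sum_{\tau=1}^{t} \frac{(y_\tau - H_\tau^T X)^2}{2\sigma^2} - \frac{t}{2} \log(2\pi\sigma^2). \tag{4} $$

Then, we have

$$ \mathrm{CRLB} = \left( \mathrm{E}\left[ -\frac{\partial^2}{\partial X^2} L_t \,\Big|\, \mathbf{H}_t \right] \right)^{-1} = \sigma^2 \mathbf{U}_t^{-1}, \tag{5} $$

where $\mathrm{E}\left[ -\frac{\partial^2}{\partial X^2} L_t \,|\, \mathbf{H}_t \right]$ is the Fisher information matrix and we defined $\mathbf{U}_t = \mathbf{H}_t^T \mathbf{H}_t$. Since $\mathrm{E}[Y_t | \mathbf{H}_t] = \mathbf{H}_t X$ and $\mathrm{Cov}(Y_t | \mathbf{H}_t) = \sigma^2 \mathbf{I}$, from (3) we have $\mathrm{E}[\hat{X}_t | \mathbf{H}_t] = X$ and $\mathrm{Cov}(\hat{X}_t | \mathbf{H}_t) = \sigma^2 \mathbf{U}_t^{-1}$, thus from (5) $\mathrm{Cov}(\hat{X}_t | \mathbf{H}_t) = \mathrm{CRLB}$. Note that the maximum likelihood estimator, maximizing (4), coincides with the OLS estimator in (3).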

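In general, under a non-Gaussian distribution the OLS estimator is the best linear unbiased estimator (BLUE). In other words, any linear unbiased estimator $\mathbf{A}_t Y_t$, where $\mathbf{A}_t \in \mathbb{R}^{n \times t}$ satisfies $\mathrm{E}[\mathbf{A}_t Y_t | \mathbf{H}_t] = X$, has a covariance not smaller than that of the OLS estimator in (3), i.e., $\mathrm{Cov}(\mathbf{A}_t Y_t | \mathbf{H}_t) \geq \sigma^2 \mathbf{U}_t^{-1}$ in the positive semidefinite sense. To see this result we write $\mathbf{A}_t = (\mathbf{H}_t^T \mathbf{H}_t)^{-1} \mathbf{H}_t^T + \mathbf{B}_t$ for some $\mathbf{B}_t \in \mathbb{R}^{n \times t}$. Unbiasedness for every $X$ requires $\mathbf{B}_t \mathbf{H}_t = \mathbf{0}$, so the cross terms vanish and $\mathrm{Cov}(\mathbf{A}_t Y_t | \mathbf{H}_t) = \sigma^2 \mathbf{U}_t^{-1} + \sigma^2 \mathbf{B}_t \mathbf{B}_t^T$, where $\mathbf{B}_t \mathbf{B}_t^T$ is a positive semidefinite matrix.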

The recursive least squares (RLS) algorithm enables us to compute $\hat{X}_t$ in a much simpler way than (3), which requires a matrix inversion at each time $t$. Using RLS, at each time $t$, we can update $\hat{X}_t$ as

$$ \hat{X}_t = \hat{X}_{t-1} + K_t (y_t - H_t^T \hat{X}_{t-1}), \quad \text{where } K_t = \frac{\mathbf{P}_{t-1} H_t}{1 + H_t^T \mathbf{P}_{t-1} H_t} \text{ and } \mathbf{P}_t = \mathbf{P}_{t-1} - K_t H_t^T \mathbf{P}_{t-1}, \tag{6} $$

$K_t \in \mathbb{R}^n$ being the gain vector and $\mathbf{P}_t = \mathbf{U}_t^{-1}$. While applying RLS we first initialize $\hat{X}_0 = 0$ and $\mathbf{P}_0 = \delta^{-1} \mathbf{I}$, where $0$ represents a zero vector and $\delta$ is a small number, and then at each time $t$ compute $K_t$, $\hat{X}_t$ and $\mathbf{P}_t$ as in (6), respectively.
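As a minimal sketch of the recursion in (6), assuming NumPy and the illustrative (hypothetical) names `rls_update` and `run_rls`, the following Python code processes the samples one at a time. Because of the initialization $\mathbf{P}_0 = \delta^{-1}\mathbf{I}$, the output matches the batch solution in (3) only up to a regularization error that shrinks as $\delta \to 0$; the default value of `delta` is an assumption.

```python
import numpy as np

def rls_update(x_hat, P, h, y):
    """One RLS step for y = h^T x + w, following (6); returns (x_hat, P)."""
    Ph = P @ h                             # P_{t-1} H_t
    k = Ph / (1.0 + h @ Ph)                # gain vector K_t
    x_hat = x_hat + k * (y - h @ x_hat)    # correct the estimate with the innovation
    P = P - np.outer(k, Ph)                # K_t H_t^T P_{t-1}, using symmetry of P
    return x_hat, P

def run_rls(H, y, delta=1e-6):
    """Run RLS over the rows of H (one coefficient vector per time step)."""
    n = H.shape[1]
    x_hat = np.zeros(n)                    # X_0 = 0
    P = np.eye(n) / delta                  # P_0 = delta^{-1} I
    for h_t, y_t in zip(H, y):
        x_hat, P = rls_update(x_hat, P, h_t, y_t)
    return x_hat, P
```

For a well-conditioned `H`, `run_rls(H, y)[0]` should agree closely with `np.linalg.lstsq(H, y, rcond=None)[0]`, and the returned `P` with the inverse of `H.T @ H`.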

II. Optimal Sequential Estimators 

In this section we aim to find the optimal pair $(\mathcal{T}, \hat{X}_{\mathcal{T}})$ of stopping time and estimator, giving us the optimal sequential estimator. The stopping time for an estimator is selected as the first time it achieves a target accuracy level. We assess the accuracy of an estimator by using either its covariance matrix $\mathrm{Cov}(\hat{X}_t)$ or its conditional covariance matrix $\mathrm{Cov}(\hat{X}_t | \mathbf{H}_t)$. Specifically, we have the following constrained optimization problems,

$$ \min_{\mathcal{T}, \hat{X}_{\mathcal{T}}} \mathrm{E}[\mathcal{T} | \mathbf{H}_{\mathcal{T}}] \quad \text{such that} \quad f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})\big) \leq C, \tag{7} $$

$$ \text{and} \quad \min_{\mathcal{T}, \hat{X}_{\mathcal{T}}} \mathrm{E}[\mathcal{T}] \quad \text{such that} \quad f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}})\big) \leq C, \tag{8} $$

under the conditional and unconditional setups, respectively, where $f(\cdot)$ is a function from $\mathbb{R}^{n \times n}$ to $\mathbb{R}$ and $C \in \mathbb{R}$ is the target accuracy level.

Note that the constraint in (7) is stricter than the one in (8) since it requires that $\hat{X}_{\mathcal{T}}$ satisfies the target accuracy level for each realization of $\mathbf{H}_{\mathcal{T}}$, whereas in (8) it is sufficient that $\hat{X}_{\mathcal{T}}$ satisfies the target accuracy level on average. In other words, even if for some realizations of $\mathbf{H}_{\mathcal{T}}$ we have $f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})\big) > C$, we can still have $f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}})\big) \leq C$. Moreover, since we observe discrete-time samples, i.e., $t \in \mathbb{N}$, with each realization of $\mathbf{H}_{\mathcal{T}}$ the accuracy function $f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})\big)$ in general undershoots $C$, whereas we can always have $f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}})\big) = C$. Hence, the objective function $\mathrm{E}[\mathcal{T}]$ in (8) will in general take smaller values than the objective function $\mathrm{E}[\mathcal{T} | \mathbf{H}_{\mathcal{T}}]$ in (7). If we observed continuous-time processes with continuous paths, then we could also have $f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})\big) = C$ for each realization of $\mathbf{H}_{\mathcal{T}}$, and thus the objective functions could take the same value.

The accuracy function $f$ should be a monotone function of the covariance matrices $\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})$ and $\mathrm{Cov}(\hat{X}_{\mathcal{T}})$ in order to make fair accuracy assessments. Two popular and easy-to-compute choices are the trace $\mathrm{Tr}(\cdot)$ and the Frobenius norm $\| \cdot \|_F$. We will next deal with (7) and (8) separately.



A. Conditional Problem 

It is known that, in general, with an unconstrained stopping time the sequential CRLB is not attainable under any kind of noise (Gaussian or non-Gaussian) except Bernoulli-distributed noise [1]. We will next show that, with a stopping time $\mathcal{T}$ that only depends on $\mathbf{H}_{\mathcal{T}}$, the OLS estimator attains the sequential CRLB, i.e., $\hat{X}_{\mathcal{T}}$ is the sequential MVUE, under Gaussian noise, and it is also the sequential BLUE under non-Gaussian noise. Denote the sigma-algebra and the filtration generated by the coefficient vectors $H_1, \ldots, H_t$ with $\mathcal{H}_t$ and $\{\mathcal{H}_t\}$, respectively. Similarly denote the sigma-algebra and the filtration generated by the samples $y_1, \ldots, y_t$ with $\mathcal{F}_t$ and $\{\mathcal{F}_t\}$, respectively. Then, we are interested in $\{\mathcal{H}_t\}$-adapted stopping times. Note that an unconstrained stopping time could in general be $\{\mathcal{F}_t \cup \mathcal{H}_t\}$-adapted, for which unfortunately we know that there is no optimal sequential estimator.

Lemma 1. Having a monotone accuracy function $f$ and an $\{\mathcal{H}_t\}$-adapted stopping time $\mathcal{T}$ we can write, for the constraint in (7),

$$ f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})\big) \geq f\big(\sigma^2 \mathbf{U}_{\mathcal{T}}^{-1}\big) \tag{9} $$

for all unbiased estimators under Gaussian noise, and for all linear unbiased estimators under non-Gaussian noise. The OLS estimator satisfies this inequality with equality.

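Proof: In the previous section, the OLS estimator was shown to be the MVUE under Gaussian noise and the BLUE under non-Gaussian noise. It was also shown that $\mathrm{Cov}(\hat{X}_t | \mathbf{H}_t) = \sigma^2 \mathbf{U}_t^{-1}$ for the OLS estimator. Hence, we write

\begin{align}
f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})\big) &= f\left( \mathrm{E}\left[ \sum_{t=1}^{\infty} (\hat{X}_t - X)(\hat{X}_t - X)^T \mathbb{1}_{\{t = \mathcal{T}\}} \,\Big|\, \mathbf{H}_t \right] \right) \notag \\
&= f\left( \sum_{t=1}^{\infty} \mathrm{E}\left[ (\hat{X}_t - X)(\hat{X}_t - X)^T \,|\, \mathbf{H}_t \right] \mathbb{1}_{\{t = \mathcal{T}\}} \right) \tag{10} \\
&\geq f\left( \sum_{t=1}^{\infty} \sigma^2 \mathbf{U}_t^{-1} \mathbb{1}_{\{t = \mathcal{T}\}} \right) \tag{11} \\
&= f\big(\sigma^2 \mathbf{U}_{\mathcal{T}}^{-1}\big), \tag{12}
\end{align}

for all unbiased estimators under Gaussian noise and for all linear unbiased estimators under non-Gaussian noise. We used the fact that the event $\{\mathcal{T} = t\}$ is $\mathcal{H}_t$-measurable to write (10), and the relation $\mathrm{E}[(\hat{X}_t - X)(\hat{X}_t - X)^T | \mathbf{H}_t] = \mathrm{Cov}(\hat{X}_t | \mathbf{H}_t) \geq \sigma^2 \mathbf{U}_t^{-1}$ together with the monotonicity of $f$ to write (11). ■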
Since $\mathcal{T}$ is $\{\mathcal{H}_t\}$-adapted, we have $\mathrm{E}[\mathcal{T} | \mathbf{H}_{\mathcal{T}}] = \mathcal{T}$, and thus from (7) we want to find the first time that a member of our class of estimators (i.e., unbiased estimators under Gaussian noise and linear unbiased estimators under non-Gaussian noise) satisfies the constraint $f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})\big) \leq C$, and also the estimator that attains this earliest stopping time. From Lemma 1 it is seen that the OLS estimator achieves the earliest stopping time among its competitors. Hence, for the conditional problem the optimal pair of stopping time and estimator is $(\mathcal{T}, \hat{X}_{\mathcal{T}})$ where $\mathcal{T}$ is given by

$$ \mathcal{T} = \min\{ t \in \mathbb{N} : f(\sigma^2 \mathbf{U}_t^{-1}) \leq C \}, \tag{13} $$

and from (3), $\hat{X}_{\mathcal{T}} = \mathbf{U}_{\mathcal{T}}^{-1} \mathbf{H}_{\mathcal{T}}^T Y_{\mathcal{T}}$, which can be computed recursively as in (6). The recursive computation of $\mathbf{U}_t^{-1}$ in the test statistic in (13) is also given in (6). Note that for an accuracy function $f$ such that $f(\sigma^2 \mathbf{U}_t^{-1}) = \sigma^2 f(\mathbf{U}_t^{-1})$, e.g., $\mathrm{Tr}(\cdot)$ and $\| \cdot \|_F$, we can use the following stopping time,

$$ \mathcal{T} = \min\{ t \in \mathbb{N} : f(\mathbf{U}_t^{-1}) \leq C' \}, \tag{14} $$

where $C' = C / \sigma^2$ is the relative target accuracy with respect to the noise power. Hence, given $C'$ we do not need to know the noise variance $\sigma^2$ to run the test given by (14).
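The test in (14) with the trace as accuracy function can be sketched as follows. This is an illustrative Python snippet, not the authors' implementation: the function name `conditional_stopping_time`, the argument `c_rel` (standing for $C'$) and the default `delta` are assumptions, and $\mathbf{U}_t^{-1}$ is tracked through the $\mathbf{P}_t$ recursion of (6), so the statistic is exact only up to the small regularization introduced by $\mathbf{P}_0 = \delta^{-1}\mathbf{I}$.

```python
import numpy as np

def conditional_stopping_time(H, c_rel, delta=1e-6):
    """First t with Tr(U_t^{-1}) <= c_rel, or None if the rows of H run out."""
    n = H.shape[1]
    P = np.eye(n) / delta                  # P_0 = delta^{-1} I approximates U_0^{-1}
    for t, h in enumerate(H, start=1):
        Ph = P @ h
        # P_t = P_{t-1} - K_t H_t^T P_{t-1}, with K_t = P_{t-1} H_t / (1 + H_t^T P_{t-1} H_t)
        P = P - np.outer(Ph, Ph) / (1.0 + h @ Ph)
        if np.trace(P) <= c_rel:           # test statistic of (14) with f = Tr
            return t
    return None
```

Only the coefficient vectors enter the rule, which reflects that the stopping time is $\{\mathcal{H}_t\}$-adapted and that the noise variance is not needed once $C'$ is given.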

Note that $\mathbf{U}_t = \sum_{\tau=1}^{t} H_\tau H_\tau^T$, being a sum of positive semidefinite matrices up to time $t$, is non-decreasing in the positive semidefinite sense, and thus, from the monotonicity of $f$, the test statistic $f(\sigma^2 \mathbf{U}_t^{-1})$ is a non-increasing scalar function of time. Specifically, for the accuracy functions $\mathrm{Tr}(\cdot)$ and $\| \cdot \|_F$ we can show that if the minimum eigenvalue of $\mathbf{U}_t$ tends to infinity as $t \to \infty$, then the stopping time is finite, i.e., $\mathcal{T} < \infty$.

For the special case of scalar parameter estimation, we do not need a function $f$ to assess the accuracy of the estimator since instead of a covariance matrix we now have a variance $\sigma^2 / u_t$, where $u_t = \sum_{\tau=1}^{t} h_\tau^2$ and $h_t$ is the scaling coefficient in (1). Hence, from (14) the stopping time in the scalar case is given by

$$ \mathcal{T} = \min\left\{ t \in \mathbb{N} : u_t \geq \frac{1}{C'} \right\}, \tag{15} $$

where $u_t / \sigma^2$ is the Fisher information at time $t$. This result is in accordance with [2, Eq. (3)].

B. Unconditional Problem 

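In this case we assume $\{H_t\}$ is i.i.d. From the constrained optimization problem in (8), using a Lagrange multiplier $\lambda$ we obtain the following unconstrained optimization problem,

$$ \min_{\mathcal{T}, \hat{X}_{\mathcal{T}}} \mathrm{E}[\mathcal{T}] + \lambda f\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}})\big). \tag{16} $$

We are again interested in $\{\mathcal{H}_t\}$-adapted stopping times to use the optimality property of the OLS estimator in the sequential sense. For the sake of simplicity assume a linear accuracy function $f$ so that $f(\mathrm{E}[\cdot]) = \mathrm{E}[f(\cdot)]$, e.g., the trace function $\mathrm{Tr}(\cdot)$, which is also monotone. Then, our constraint function turns out to be the sum of the individual variances, i.e., $\mathrm{Tr}\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}})\big) = \sum_{i=1}^{n} \mathrm{Var}(\hat{x}_{\mathcal{T}}^{i})$, where $\hat{x}_{\mathcal{T}}^{i}$ is the $i$-th entry of $\hat{X}_{\mathcal{T}}$. Since the estimators we consider are conditionally unbiased, we have $\mathrm{Tr}\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}})\big) = \mathrm{Tr}\big(\mathrm{E}[\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})]\big) = \mathrm{E}\big[\mathrm{Tr}\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})\big)\big]$, and we rewrite (16) as

$$ \min_{\mathcal{T}, \hat{X}_{\mathcal{T}}} \mathrm{E}\Big[ \mathcal{T} + \lambda \mathrm{Tr}\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})\big) \Big], \tag{17} $$

where the expectation is with respect to $\mathbf{H}_{\mathcal{T}}$.

From Lemma 1 we have $\mathrm{Tr}\big(\mathrm{Cov}(\hat{X}_{\mathcal{T}} | \mathbf{H}_{\mathcal{T}})\big) \geq \mathrm{Tr}(\sigma^2 \mathbf{U}_{\mathcal{T}}^{-1})$, where $\sigma^2 \mathbf{U}_t^{-1}$ is the covariance matrix of the OLS estimator at time $t$. Note that $\mathbf{U}_t / \sigma^2$ is the Fisher information matrix at time $t$ [cf. (5)]. Using the OLS estimator we minimize the objective function in (17), hence $\hat{X}_{\mathcal{T}} = \mathbf{U}_{\mathcal{T}}^{-1} \mathbf{H}_{\mathcal{T}}^T Y_{\mathcal{T}}$ [cf. (6) for recursive computation] is the optimal estimator also in the unconditional problem.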

Now, to find the optimal stopping time we need to solve the following optimization problem,

$$ \min_{\mathcal{T}} \mathrm{E}\big[ \mathcal{T} + \lambda \mathrm{Tr}(\sigma^2 \mathbf{U}_{\mathcal{T}}^{-1}) \big], \tag{18} $$

which can be solved by using optimal stopping theory. Writing (18) in the following alternative form

$$ \min_{\mathcal{T}} \mathrm{E}\left[ \sum_{t=0}^{\mathcal{T}-1} 1 + \lambda \mathrm{Tr}(\sigma^2 \mathbf{U}_{\mathcal{T}}^{-1}) \right], \tag{19} $$

we see that the term $\sum_{t=0}^{\mathcal{T}-1} 1$ accounts for the cost of not stopping until time $\mathcal{T}$ and the term $\lambda \mathrm{Tr}(\sigma^2 \mathbf{U}_{\mathcal{T}}^{-1})$ represents the cost of stopping at time $\mathcal{T}$. Note that $\mathbf{U}_t = \mathbf{U}_{t-1} + H_t H_t^T$ and, given $\mathbf{U}_{t-1}$, the current state $\mathbf{U}_t$ is (conditionally) independent of all previous states, hence $\{\mathbf{U}_t\}$ is a Markov process. That is, the optimal stopping time for a Markov process is sought in (19). From [3] the solution is given by

$$ V(\mathbf{U}) = \min\big\{ \lambda \mathrm{Tr}(\sigma^2 \mathbf{U}^{-1}), \; 1 + \mathrm{E}[V(\mathbf{U} + H_1 H_1^T) | \mathbf{U}] \big\}, \tag{20} $$

where the expectation is with respect to $H_1$ and $V$ is the optimal cost function. The optimal cost function is found by iterating a sequence of functions $\{V_n\}$ where $V(\mathbf{U}) = \lim_{n \to \infty} V_n(\mathbf{U})$ and

$$ V_n(\mathbf{U}) = \min\big\{ \lambda \mathrm{Tr}(\sigma^2 \mathbf{U}^{-1}), \; 1 + \mathrm{E}[V_{n-1}(\mathbf{U} + H_1 H_1^T) | \mathbf{U}] \big\}. \tag{21} $$

In optimal stopping theory, the original complex optimization problem in (18) is divided into simpler subproblems given by (20). At each time $t$ we are faced with a subproblem consisting of a stopping cost $F(\mathbf{U}_t) = \lambda \mathrm{Tr}(\sigma^2 \mathbf{U}_t^{-1})$ and an expected sampling cost $G(\mathbf{U}_t) = 1 + \mathrm{E}[V(\mathbf{U}_{t+1}) | \mathbf{U}_t]$ to proceed to time $t + 1$. The optimal cost function $V(\mathbf{U}_t)$, selecting the action with minimum cost (i.e., either continue or stop), determines the optimal policy to follow at each time $t$. Specifically, the optimal policy, as we will show later in this section, chooses to continue as long as $V(\mathbf{U}_t) = G(\mathbf{U}_t)$ and stops the first time $V(\mathbf{U}_t) = F(\mathbf{U}_t)$. We need to analyze the structure of $V(\mathbf{U}_t)$, i.e., the cost functions $F(\mathbf{U}_t)$ and $G(\mathbf{U}_t)$, to show such a behavior for the optimal policy and find the optimal stopping time $\mathcal{T}$.

Note that $V$, being a function of the symmetric matrix $\mathbf{U} \in \mathbb{R}^{n \times n}$, is a function of $\frac{n^2 + n}{2}$ variables $\{u_{ij} : i \leq j\}$ where $\mathbf{U} = [u_{ij}]$. Analyzing a multi-dimensional optimal cost function proves intractable, hence we will first analyze the special case of scalar parameter estimation and then provide some numerical results for the two-dimensional vector case, demonstrating how intractable the higher dimensional problems are.

For the scalar case, from (20) we have the following one-dimensional optimal cost function,

$$ V(u) = \min\left\{ \frac{\lambda \sigma^2}{u}, \; 1 + \mathrm{E}[V(u + h_1^2)] \right\}, \tag{22} $$

where the expectation is with respect to $h_1$, and $h_1$ is a scalar coefficient scaling the parameter $x$ to be estimated [cf. (1)]. Write $V$ as a function of $w = 1/u$,

$$ V(w) = \min\left\{ \lambda \sigma^2 w, \; 1 + \mathrm{E}\left[ V\left( \frac{w}{1 + w h_1^2} \right) \right] \right\}, \tag{23} $$

where as before the expectation is with respect to $h_1$. We need to analyze the cost functions $F(w) = \lambda \sigma^2 w$ and $G(w) = 1 + \mathrm{E}\left[ V\left( \frac{w}{1 + w h_1^2} \right) \right]$. The former is a line, whereas the latter is in general a nonlinear function of $w$. We have the following theorem regarding the structure of $V(w)$ and $G(w)$. Its proof is given in the Appendix.

Theorem 1. The optimal cost $V$ and the expected sampling cost $G$, given in (23), are non-decreasing, concave and bounded functions of $w$.

The cost functions $F(w)$ and $G(w)$ are continuous functions as $F$ is linear and $G$ is concave. From (23) we have $V(0) = \min\{0, 1 + V(0)\} = 0$, hence $G(0) = 1 + V(0) = 1$. Then, using Theorem 1 we show $F(w)$ and $G(w)$ in Fig. 1. The optimal cost function $V(w)$, being the minimum of $F$ and $G$ [cf. (23)], is also shown in Fig. 1, justifying Theorem 1. Note that as $t$ increases $w$ tends from infinity to zero. Hence, we continue until the stopping cost $F(w_t)$ is lower than the expected sampling cost $G(w_t)$, i.e., until $w_t \leq C''$. In other words, the stopping time in the scalar case of the unconditional problem is given by

$$ \mathcal{T} = \min\left\{ t \in \mathbb{N} : u_t \geq \frac{1}{C''} \right\}, \tag{24} $$

similar to the scalar case of the conditional problem [cf. (15)]. Note that the threshold $C''$ is determined by the Lagrange multiplier $\lambda$, which is selected so that $\mathrm{E}\left[ \frac{\sigma^2}{u_{\mathcal{T}}} \right] = C$, i.e., the variance of the estimator exactly hits the target accuracy level $C$ [cf. (16)]. Accordingly, we have $C'' \geq C / \sigma^2 = C'$ since the upper bound $C$ on the conditional variance $\sigma^2 w_{\mathcal{T}}$ is also an upper bound for the variance $\mathrm{E}[\sigma^2 w_{\mathcal{T}}]$. This result implies that the stopping time of the unconditional problem will be smaller than that of the conditional problem.

Fig. 1. The structures of the optimal cost function $V(w)$ and the cost functions $F(w)$ and $G(w)$.
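To make the scalar recursion concrete, the iteration $V_n(w) = \min\{\lambda\sigma^2 w, 1 + \mathrm{E}[V_{n-1}(w/(1 + w h_1^2))]\}$ can be approximated numerically. The Python sketch below is only illustrative and rests on several assumptions that are not part of the analysis above: $h_1 \sim \mathcal{N}(0, 1)$, a truncated uniform grid on $[0, w_{\max}]$, linear interpolation between grid points, and Gauss-Hermite quadrature in place of the exact expectation. The function names and default parameters are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def scalar_value_iteration(lam, sigma2, w_max=20.0, n_grid=401, n_iter=200, n_quad=40):
    """Approximate F, G and V of the scalar problem on a grid of w = 1/u."""
    w = np.linspace(0.0, w_max, n_grid)
    nodes, weights = hermegauss(n_quad)        # quadrature rule for weight exp(-h^2 / 2)
    weights = weights / weights.sum()          # normalize to the N(0, 1) density (assumed)
    F = lam * sigma2 * w                       # stopping cost
    # Next state w / (1 + w h^2) for every grid point and quadrature node; it never exceeds w.
    next_w = w[:, None] / (1.0 + w[:, None] * nodes[None, :] ** 2)
    V = np.zeros_like(w)                       # V_0 = 0
    for _ in range(n_iter):
        G = 1.0 + np.interp(next_w, w, V) @ weights   # expected sampling cost
        V = np.minimum(F, G)
    return w, F, G, V

def stopping_threshold(w, F, G):
    """Largest grid value of w at which stopping is optimal, i.e. F(w) <= G(w)."""
    return w[F <= G].max()
```

Calling `stopping_threshold(*scalar_value_iteration(lam, sigma2)[:3])` gives a grid approximation of the threshold $C''$ for a given $\lambda$; selecting $\lambda$ so that the accuracy constraint is met would still require a separate simulation.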

We will next show that the multi-dimensional cases are intractable by providing some numerical results for the two-dimensional case. In the two-dimensional case, from (20) the optimal cost function is written as

$$ V(u_{11}, u_{12}, u_{22}) = \min\left\{ \lambda \sigma^2 \frac{u_{11} + u_{22}}{u_{11} u_{22} - u_{12}^2}, \; 1 + \mathrm{E}\left[ V(u_{11} + h_{1,1}^2, \, u_{12} + h_{1,1} h_{1,2}, \, u_{22} + h_{1,2}^2) \right] \right\}, \tag{25} $$

where $\mathbf{U} = \begin{bmatrix} u_{11} & u_{12} \\ u_{12} & u_{22} \end{bmatrix}$, $H_1 = \begin{bmatrix} h_{1,1} \\ h_{1,2} \end{bmatrix}$, and the expectation is with respect to $h_{1,1}$ and $h_{1,2}$. Changing variables we can write $V$ as a function of $w_{11} = 1/u_{11}$, $w_{22} = 1/u_{22}$ and $\rho = u_{12} / \sqrt{u_{11} u_{22}}$,

$$ V(w_{11}, w_{22}, \rho) = \min\left\{ \lambda \sigma^2 \frac{w_{11} + w_{22}}{1 - \rho^2}, \; 1 + \mathrm{E}\left[ V\left( \frac{w_{11}}{1 + w_{11} h_{1,1}^2}, \, \frac{w_{22}}{1 + w_{22} h_{1,2}^2}, \, \frac{\rho + h_{1,1} h_{1,2} \sqrt{w_{11} w_{22}}}{\sqrt{(1 + w_{11} h_{1,1}^2)(1 + w_{22} h_{1,2}^2)}} \right) \right] \right\}. \tag{26} $$

Fig. 2. The surface that defines the stopping rule for $\lambda = 1$, $\sigma^2 = 1$ and $h_{1,1}, h_{1,2} \sim \mathcal{N}(0, 1)$ in the two-dimensional case.

Denote the first term inside the min function in (26) with $F(w_{11}, w_{22}, \rho)$ and the second term with $G(w_{11}, w_{22}, \rho)$. Then, we can iteratively compute



$$ V_n(w_{11}, w_{22}, \rho) = \min\left\{ \lambda \sigma^2 \frac{w_{11} + w_{22}}{1 - \rho^2}, \; 1 + \mathrm{E}\left[ V_{n-1}\left( \frac{w_{11}}{1 + w_{11} h_{1,1}^2}, \, \frac{w_{22}}{1 + w_{22} h_{1,2}^2}, \, \frac{\rho + h_{1,1} h_{1,2} \sqrt{w_{11} w_{22}}}{\sqrt{(1 + w_{11} h_{1,1}^2)(1 + w_{22} h_{1,2}^2)}} \right) \right] \right\}, \tag{27} $$

where $\lim_{n \to \infty} V_n = V$. Note that $\rho$ is the correlation coefficient, hence we have $\rho \in [-1, 1]$. Assuming $\sigma^2 = 1$ and $h_{1,1}, h_{1,2} \sim \mathcal{N}(0, 1)$ we numerically compute $V$ and find the region that defines the stopping rule for $\lambda \in \{0.01, 1, 100\}$. For $\lambda = 1$, the dome-shaped surface in Fig. 2 separates the stopping region from the continuation region. Outside the "dome" $V = G$, thus we continue. As time progresses $w_{11}$ and $w_{22}$ decrease, so we move towards the "dome". Whenever we are inside the "dome", we stop, i.e., $V = F$.
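The change of variables from $(u_{11}, u_{12}, u_{22})$ to $(w_{11}, w_{22}, \rho)$ can be checked numerically. The short Python sketch below, with hypothetical helper names, implements the stopping cost and the one-step state transition that appear inside the min function of the two-dimensional recursion, so that they can be compared against a direct update $\mathbf{U} + H_1 H_1^T$. It does not attempt the three-dimensional value iteration itself.

```python
import numpy as np

def to_w_rho(U):
    """Map a 2x2 matrix U to the state (w11, w22, rho)."""
    w11, w22 = 1.0 / U[0, 0], 1.0 / U[1, 1]
    rho = U[0, 1] / np.sqrt(U[0, 0] * U[1, 1])
    return w11, w22, rho

def stopping_cost(w11, w22, rho, lam, sigma2):
    """Stopping cost lam * sigma2 * Tr(U^{-1}) written in the new variables."""
    return lam * sigma2 * (w11 + w22) / (1.0 - rho ** 2)

def next_state(w11, w22, rho, h1, h2):
    """State reached after observing the coefficient vector (h1, h2)."""
    d1 = 1.0 + w11 * h1 ** 2
    d2 = 1.0 + w22 * h2 ** 2
    rho_next = (rho + h1 * h2 * np.sqrt(w11 * w22)) / np.sqrt(d1 * d2)
    return w11 / d1, w22 / d2, rho_next
```

For any positive definite `U` and any `h`, `next_state(*to_w_rho(U), h[0], h[1])` should coincide with `to_w_rho(U + np.outer(h, h))`.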

We obtain similar dome-shaped surfaces for different $\lambda$ values. However, the cross-sections of the "domes" at specific $\rho_t$ values differ significantly. In particular, we investigate the $\rho_t = 0$ case, where the scaling coefficients $h_{t,1}$ and $h_{t,2}$ are uncorrelated. For small values of $\lambda$, e.g., $\lambda = 0.01$, the curve that separates the stopping and the continuation regions is highly nonlinear as shown in Fig. 3(a). In Fig. 3(b) and 3(c), it is seen that the separating curve becomes more and more linear as $\lambda$ increases.

Fig. 3. The stopping regions for $\rho_t = 0$, $\sigma^2 = 1$ and $h_{t,1}, h_{t,2} \sim \mathcal{N}(0, 1), \forall t$ in the unconditional problem with (a) $\lambda = 0.01$, (b) $\lambda = 1$, (c) $\lambda = 100$. That of the conditional problem is also approximately shown in (c).

Now let us explain what small or large $\lambda$ means. Firstly, note from (26) that $F$ and $G$ are functions of $w_{t,11}, w_{t,22}$ for fixed $\rho_t$, and the separating curve is the solution of the equality $F(\lambda, w_{t,11}, w_{t,22}) = G(w_{t,11}, w_{t,22})$. When $\lambda$ is small, the region where $F < G$, i.e., the stopping region, is large, hence we stop early as shown in Fig. 3(a)^1. Conversely, for large $\lambda$ the stopping region is small, hence the stopping time is large [cf. Fig. 3(c)].

In fact, the Lagrange multiplier $\lambda$ is selected via simulations so that we satisfy, on average, the target accuracy level, i.e., $\mathrm{E}\left[ \sigma^2 \frac{w_{\mathcal{T},11} + w_{\mathcal{T},22}}{1 - \rho_{\mathcal{T}}^2} \right] = C$. In other words, in the unconditional problem we need to numerically find the stopping rule, i.e., the stopping and continuation regions, before the optimal sequential estimation algorithm starts. This becomes a quite intractable task as the dimension of the vector to be estimated increases, since the computation of $G$ in (26) in the $n$-dimensional case requires computing an $n$-dimensional integral. On the other hand, in the conditional problem we have a simple stopping rule given in (13), which uses the target accuracy level $C$ as its threshold, hence it is known beforehand for any $n$. Specifically, in the two-dimensional case the stopping time is given by $\mathcal{T} = \min\left\{ t \in \mathbb{N} : \frac{w_{t,11} + w_{t,22}}{1 - \rho_t^2} \leq \frac{C}{\sigma^2} \right\}$, which for $\rho_t = 0$ reads $\mathcal{T} = \min\left\{ t \in \mathbb{N} : w_{t,11} + w_{t,22} \leq \frac{C}{\sigma^2} \right\}$. In this case, the stopping rule is a function of $w_{t,11} + w_{t,22}$, hence characterized by a line as shown in Fig. 3(c). In Fig. 3(c), where $\sigma^2 = 1$, the stopping region of the conditional problem is approximately shown to be smaller than that of the unconditional problem due to the same reasoning as in the scalar case.

^1 Note that the axis scales in Fig. 3(a) are on the order of hundreds and $w_{t,11}, w_{t,22}$ decrease as $t$ increases.

?? Asymptotically $G$ is a function of $w_{t,11} + w_{t,22}$ as $w_{t,11}, w_{t,22} \to 0$, i.e., $t \to \infty$ ??



III. Decentralized Implementation for the Conditional Problem 

The fusion center (FC) needs to compute the test statistic $\mathrm{Tr}(\mathbf{U}_t^{-1})$ where $\mathbf{U}_t = \sum_{k=1}^{K} \mathbf{U}_t^k$ and $\mathbf{U}_t^k$ is the matrix computed at sensor $k$. Write $\mathbf{U}_t = \mathbf{D}_t + \mathbf{E}_t$ where $\mathbf{D}_t$ and $\mathbf{E}_t$ are matrices that contain the diagonal and off-diagonal entries of $\mathbf{U}_t$, respectively.

If the scaling coefficients are uncorrelated for all $k$, then from the Law of Large Numbers $\mathrm{Tr}(\mathbf{U}_t^{-1}) \approx \mathrm{Tr}(\mathbf{D}_t^{-1})$ for sufficiently large $t$. Hence, sensors need to send only the diagonal entries of $\mathbf{U}_t^k$, decreasing the complexity considerably.

In the correlated case, we write

$$ \mathrm{Tr}(\mathbf{U}_t^{-1}) = \mathrm{Tr}\left( \left[ \mathbf{D}_t^{1/2} \mathbf{R}_t \mathbf{D}_t^{1/2} \right]^{-1} \right). \tag{28} $$

If the correlation matrices for each sensor are known, then $\mathbf{R}$ is known where $\mathbf{R}_t \to \mathbf{R}$ by the Law of Large Numbers. If the variances of each scaling coefficient are the same for all sensors, then the FC needs to know only the correlation coefficients for each sensor. Hence, again it is sufficient that sensors send only the diagonal elements of $\mathbf{U}_t^k$ for a good approximation of the test statistic $\mathrm{Tr}(\mathbf{U}_t^{-1})$.
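A small numerical illustration of the diagonal approximation is given below. It is a sketch under stated assumptions rather than part of the proposed scheme: the function names are hypothetical, the scaling coefficients are assumed zero-mean, and the matrix `R` passed to `diagonal_statistic` plays the role of the known limit of $\mathbf{R}_t$ in (28). The code uses the identity $\mathrm{Tr}([\mathbf{D}^{1/2} \mathbf{R} \mathbf{D}^{1/2}]^{-1}) = \sum_i [\mathbf{R}^{-1}]_{ii} / d_i$, where $d_i$ are the diagonal entries of $\mathbf{D}$.

```python
import numpy as np

def exact_statistic(U):
    """Tr(U^{-1}) computed from the full matrix."""
    return np.trace(np.linalg.inv(U))

def diagonal_statistic(U_diag, R=None):
    """Approximate Tr(U^{-1}) from diag(U) and, optionally, a known correlation matrix R."""
    d = np.asarray(U_diag, dtype=float)
    if R is None:
        return np.sum(1.0 / d)                 # uncorrelated case: Tr(D^{-1})
    # Correlated case, cf. (28): Tr((D^{1/2} R D^{1/2})^{-1}) = sum_i [R^{-1}]_ii / d_i
    return np.sum(np.diag(np.linalg.inv(R)) / d)
```

With uncorrelated coefficients and many samples, `diagonal_statistic(np.diag(U))` should be close to `exact_statistic(U)`, which is what allows the sensors to transmit only the diagonal entries.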
Simulations 



Appendix: Proof of Theorem 1
We will first prove that if $V(w)$ is non-decreasing, concave and bounded, then so is $G(w) = 1 + \mathrm{E}\left[ V\left( \frac{w}{1 + w h_1^2} \right) \right]$. If $V(w)$ is non-decreasing, concave and bounded, we have the following properties: (i1) $\frac{d}{dw} V(w) \geq 0$, (i2) $\frac{d^2}{dw^2} V(w) \leq 0$, (i3) $V(w) \leq c < \infty, \forall w$, respectively. From (i3) we have

$$ V\left( \frac{w}{1 + w h_1^2} \right) \leq c, \quad \forall w, \tag{29} $$

hence $V\left( \frac{w}{1 + w h_1^2} \right)$ is bounded. Taking the first derivative of $V\left( \frac{w}{1 + w h_1^2} \right)$ we write

$$ \frac{d}{dw} V\left( \frac{w}{1 + w h_1^2} \right) = V'\left( \frac{w}{1 + w h_1^2} \right) \frac{1}{(1 + w h_1^2)^2}, \tag{30} $$

where $V'(\cdot) \geq 0$ from (i1), making the right-hand side non-negative, and thus $V\left( \frac{w}{1 + w h_1^2} \right)$ is non-decreasing. To show concavity take the second derivative

$$ \frac{d^2}{dw^2} V\left( \frac{w}{1 + w h_1^2} \right) = V''\left( \frac{w}{1 + w h_1^2} \right) \frac{1}{(1 + w h_1^2)^4} + V'\left( \frac{w}{1 + w h_1^2} \right) \frac{-2 h_1^2}{(1 + w h_1^2)^3}, \tag{31} $$

where $V''(\cdot) \leq 0$ from (i2), hence the first term on the right-hand side is non-positive. The second term is also non-positive since $V'(\cdot) \geq 0$ from (i1) and $w = 1/u > 0$. Hence, $\frac{d^2}{dw^2} V\left( \frac{w}{1 + w h_1^2} \right) \leq 0$, meaning that $V\left( \frac{w}{1 + w h_1^2} \right)$ is concave. Note that the expectation is in fact a weighted sum, and a weighted sum of non-decreasing, concave and bounded functions is also non-decreasing, concave and bounded. Finally, taking the expectation and then adding a constant we have a non-decreasing, concave and bounded function $G(w)$, concluding the first part of the proof.

Now, it is sufficient to show that $V(w)$ is non-decreasing, concave and bounded. Assume that the limit $\lim_{n \to \infty} V_n(w) = V(w)$ exists. We will prove the existence of the limit after showing that $V(w)$ is non-decreasing, concave and bounded. First, we will show that $V(w)$ is non-decreasing and concave by iterating the functions $\{V_n(w)\}$. Start with $V_0(w) = 0$. Then,

$$ V_1(w) = \min\left\{ \lambda \sigma^2 w, \; 1 + \mathrm{E}\left[ V_0\left( \frac{w}{1 + w h_1^2} \right) \right] \right\} = \min\{\lambda \sigma^2 w, 1\}, \tag{32} $$

which is non-decreasing and concave as shown in Fig. 4. Similarly we write

$$ V_2(w) = \min\left\{ \lambda \sigma^2 w, \; 1 + \mathrm{E}\left[ V_1\left( \frac{w}{1 + w h_1^2} \right) \right] \right\}. \tag{33} $$

From the first part of the proof, $1 + \mathrm{E}\left[ V_1\left( \frac{w}{1 + w h_1^2} \right) \right]$ is non-decreasing and concave since $V_1(w)$ is non-decreasing and concave. Hence, $V_2(w)$ is non-decreasing and concave since the pointwise minimum of non-decreasing and concave functions is again non-decreasing and concave. We can show in the same way that $V_n(w)$ is non-decreasing and concave for $n > 2$, i.e., $V(w) = V_\infty(w)$ is non-decreasing and concave.

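Next, we will show that $V(w)$ is bounded. Assume that

$$ V(w) \leq \min\{\lambda \sigma^2 w, c\} = \lambda \sigma^2 w \, \mathbb{1}_{\{\lambda \sigma^2 w \leq c\}} + c \, \mathbb{1}_{\{\lambda \sigma^2 w > c\}}. \tag{34} $$

Then, from the definition of $V(w)$ we have $1 + \mathrm{E}\left[ V\left( \frac{w}{1 + w h_1^2} \right) \right] < c$. Since $V(w)$ is non-decreasing, $\mathrm{E}\left[ V\left( \frac{w}{1 + w h_1^2} \right) \right] \leq \mathrm{E}\left[ V\left( \frac{1}{h_1^2} \right) \right]$. From (34) we can write

$$ 1 + \mathrm{E}\left[ V\left( \frac{1}{h_1^2} \right) \right] \leq 1 + \mathrm{E}\left[ \frac{\lambda \sigma^2}{h_1^2} \mathbb{1}_{\{\lambda \sigma^2 / h_1^2 \leq c\}} \right] + c \, \mathrm{P}\left( \frac{\lambda \sigma^2}{h_1^2} > c \right), \tag{35} $$

hence $1 + \mathrm{E}\left[ V\left( \frac{w}{1 + w h_1^2} \right) \right]$ is also less than the right-hand side of (35). Recalling $1 + \mathrm{E}\left[ V\left( \frac{w}{1 + w h_1^2} \right) \right] < c$, we want to find a $c$ such that

$$ 1 + \mathrm{E}\left[ \frac{\lambda \sigma^2}{h_1^2} \mathbb{1}_{\{\lambda \sigma^2 / h_1^2 \leq c\}} \right] + c \, \mathrm{P}\left( \frac{\lambda \sigma^2}{h_1^2} > c \right) < c. \tag{36} $$

For such a $c$ we have

$$ 1 < c \, \mathrm{P}\left( \frac{\lambda \sigma^2}{h_1^2} \leq c \right) - \mathrm{E}\left[ \frac{\lambda \sigma^2}{h_1^2} \mathbb{1}_{\{\lambda \sigma^2 / h_1^2 \leq c\}} \right] = \mathrm{E}\left[ \left( c - \frac{\lambda \sigma^2}{h_1^2} \right)^{+} \right], \tag{37} $$

where $(\cdot)^{+}$ is the positive part operator. We need to show that there exists a $c$ satisfying $\mathrm{E}\left[ \left( c - \frac{\lambda \sigma^2}{h_1^2} \right)^{+} \right] > 1$. Note that we can write

$$ \mathrm{E}\left[ \left( c - \frac{\lambda \sigma^2}{h_1^2} \right)^{+} \right] \geq \mathrm{E}\left[ \left( c - \frac{\lambda \sigma^2}{h_1^2} \right)^{+} \mathbb{1}_{\{h_1^2 > \epsilon\}} \right] \geq \mathrm{E}\left[ \left( c - \frac{\lambda \sigma^2}{\epsilon} \right)^{+} \mathbb{1}_{\{h_1^2 > \epsilon\}} \right] = \left( c - \frac{\lambda \sigma^2}{\epsilon} \right)^{+} \mathrm{P}(h_1^2 > \epsilon), \tag{38} $$

where $\left( c - \frac{\lambda \sigma^2}{\epsilon} \right)^{+} \to \infty$ as $c \to \infty$ since $\lambda$ and $\epsilon$ are constants. If $\mathrm{P}(h_1^2 > \epsilon) > 0$, which is always true except in the trivial case where $h_1 = 0$ deterministically, then the desired $c$ exists.

Fig. 4. The function $V_1(w)$ is non-decreasing and concave.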

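Now, what remains is to justify our initial assumption $V(w) \leq \min\{\lambda \sigma^2 w, c\}$. We will use induction to show that the assumption holds with the $c$ found above. From (32) we have $V_1(w) = \min\{\lambda \sigma^2 w, 1\} \leq \min\{\lambda \sigma^2 w, c\}$ since $c > 1$. Then, assume that

$$ V_{n-1}(w) \leq \min\{\lambda \sigma^2 w, c\} = \lambda \sigma^2 w \, \mathbb{1}_{\{\lambda \sigma^2 w \leq c\}} + c \, \mathbb{1}_{\{\lambda \sigma^2 w > c\}}. \tag{39} $$

We need to show that $V_n(w) \leq \min\{\lambda \sigma^2 w, c\}$, where $V_n(w) = \min\left\{ \lambda \sigma^2 w, \; 1 + \mathrm{E}\left[ V_{n-1}\left( \frac{w}{1 + w h_1^2} \right) \right] \right\}$. Note that $1 + \mathrm{E}\left[ V_{n-1}\left( \frac{w}{1 + w h_1^2} \right) \right] \leq 1 + \mathrm{E}\left[ V_{n-1}\left( \frac{1}{h_1^2} \right) \right]$ since $V_{n-1}(w)$ is non-decreasing. Similar to (35), from (39) we have

$$ 1 + \mathrm{E}\left[ V_{n-1}\left( \frac{1}{h_1^2} \right) \right] \leq 1 + \mathrm{E}\left[ \frac{\lambda \sigma^2}{h_1^2} \mathbb{1}_{\{\lambda \sigma^2 / h_1^2 \leq c\}} \right] + c \, \mathrm{P}\left( \frac{\lambda \sigma^2}{h_1^2} > c \right) < c, \tag{40} $$

where the last inequality follows from (36). Hence,

$$ V_n(w) \leq \min\{\lambda \sigma^2 w, c\}, \quad \forall n, \tag{41} $$

showing that $V(w) \leq \min\{\lambda \sigma^2 w, c\}$, which is the assumption in (34).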

We showed that $V(w)$ is non-decreasing, concave and bounded if it exists, i.e., if the limit $\lim_{n \to \infty} V_n(w)$ exists. Note that we showed in (41) that the sequence $\{V_n\}$ is bounded. If we also show that $\{V_n\}$ is monotone, e.g., non-decreasing, then by the monotone convergence theorem for real numbers $\{V_n\}$ converges to a finite limit $V(w)$. We will again use induction to show monotonicity for $\{V_n\}$. From (32) we write $V_1(w) = \min\{\lambda \sigma^2 w, 1\} \geq V_0(w) = 0$. Assuming $V_{n-1}(w) \geq V_{n-2}(w)$ we need to show that $V_n(w) \geq V_{n-1}(w)$. Using their definitions we write $V_n(w) = \min\left\{ \lambda \sigma^2 w, \; 1 + \mathrm{E}\left[ V_{n-1}\left( \frac{w}{1 + w h_1^2} \right) \right] \right\}$ and $V_{n-1}(w) = \min\left\{ \lambda \sigma^2 w, \; 1 + \mathrm{E}\left[ V_{n-2}\left( \frac{w}{1 + w h_1^2} \right) \right] \right\}$. We have $1 + \mathrm{E}\left[ V_{n-1}\left( \frac{w}{1 + w h_1^2} \right) \right] \geq 1 + \mathrm{E}\left[ V_{n-2}\left( \frac{w}{1 + w h_1^2} \right) \right]$ due to the assumption $V_{n-1}(w) \geq V_{n-2}(w)$, hence $V_n(w) \geq V_{n-1}(w)$.

To conclude, we proved that $V_n(w)$ is non-decreasing and bounded in $n$, thus the limit $V(w)$ exists, which was also shown to be non-decreasing, concave and bounded. Hence, $G(w)$ is non-decreasing, concave and bounded.